Abstract

Reduction of costs in biological signalling seems an evo- 
lutionary advantage, but recent experiments have shown sig- 
nalling codes shifted to signals of high cost with an underutil- 
isation of low cost signals. Here I derive a theory for efficient 
signalling that includes both errors and costs as constraints 
and I show that errors in the efficient translation of biological 
states into signals can shift codes to higher costs, effectively 
performing a quality control. The statistical structure of 
signal usage is predicted to be of a generalised Boltzmann 
form that penalises signals that are costly and sensitive to 
errors. This predicted distribution of signal usage against 
signal cost has two main features: an exponential tail re- 
quired for cost efficiency and an underutilisation of the low 
cost signals required to protect the signalling quality from the 
errors. These predictions are shown to correspond quantita- 
tively to the experiments in which gathering signal statistics 
is feasible as in visual cortex neurons. 
1 Introduction

Cells, groups of cells and multicellular organisms communicate their states 
using signals. The types of signals and encoding mechanisms used can be 
very different but, irrespective of the mechanism, signal transmission should
have a high efficiency within biological constraints. A universal constraint 
is the signalling cost. Have biological signalling codes evolved to minimise 
cost? Cost reduction seems advantageous (??????) but signalling systems 
might be simultaneously optimal not only with respect to cost but also to other
constraints, resulting in signalling codes very different from the cost efficient
ones. A second universal constraint is communication errors. Here I consider 
the extension of information theory (??) to include errors and cost together 
as constraints of signalling systems and find the optimal signal usage under 
these constraints. For clarity of exposition and because the best data sets for 
statistical analysis are in neural signals, I will particularise the discussion to 
cell signalling and discuss the relevance of results to other signalling systems
afterwards. 

Neurons provide an experimentally tractable case of cell signalling. The
experimental evidence in neurons is counterintuitive. Neuronal codes can
underutilise low cost signals. For neurons using different spike rates as
signals, it has been found that low rates, which take less metabolic cost to
produce, are typically underutilised (?). Similarly, neurons using spike bursts
as signals underutilise the bursts of one spike, which would take less
production cost (??). Theories of cost efficiency cannot explain these
experimental results. According to the theories of cost efficiency, signalling
systems should maximise their capacity to represent different states given a
cost constraint or maximise the ratio of this representational capacity and
the cost (??). The optimal distribution for these theories is an exponential
decaying with signal cost. In this way the most probable signals are those of
lowest cost, in clear contrast to the underutilisation of the low cost signals
observed experimentally. For this reason I consider here the evolution of
biological signalling codes towards efficiency of transmission within the
biological constraints of both cost and errors.

This paper is organised as follows. Section 2 gives the theoretical frame- 
work and the general result of optimal signal usage when both costs and 
errors constrain the signalling system. To find this optimal signal usage, an 
iterative algorithm that can be easily implemented is given. Section 3 shows 
that the optimal solutions found predict quantitatively the experimental
results for signal usage in visual cortex neurons. Section 4 gives the
conclusions and discusses the application to a variety of biological
signalling systems, including animal communication, for which it is shown
that cheaters would shift efficient codes to high cost.

2 Theoretical treatment 

For signal transmission between a signaller and a receiver to work, the
signaller must use encoding rules that correlate its signalling states
$C = \{c_1, c_2, \ldots, c_N\}$ with the signals $S = \{s_1, s_2, \ldots, s_N\}$.
For intercellular signalling, the signals $S$ can be different values of
concentration of the same chemical, different mixtures of several chemicals,
different time patterns (say, different frequencies of spike generation or
bursts of different sizes), different spatial patterns or even different
patterns of activation of a group of cells. The cellular states $C$ are the
internal variables representing the ideal signals without errors.
Experimentally, identical stimulations of the cell will produce a
distribution of signals where the peak is the ideal noiseless signal
corresponding to the cellular state and the variance comes from the errors.

The correlation of states and signals is subject to the constraints imposed
by cost and errors. We characterise these errors with the error matrix of
conditional probabilities $Q_{kj} = p(c_k|s_j)$, a matrix given by the
probability that the signal $s_j$ comes from the state $c_k$. When there are
no errors present each signal comes from a single state, and the error matrix
$Q$ is diagonal. When there are errors present, there are nonzero nondiagonal
elements. The costs can be in molecular machinery (a convenient parameter can
be the number of ATP molecules), in transmission times (for example, bursts of
many spikes take longer to transmit than bursts of fewer spikes) and in risks
(for example by the use of chemicals that can be toxic). We can formally
write the costs of producing the signals as $\epsilon_{kj}$, with for example
$\epsilon_{12}$ the cost for the conversion of the first state into the second
signal. As we are interested in the signal usage, we refer the costs to the
signals as $\epsilon_j = \sum_k Q_{kj}\epsilon_{kj}$. We always label the
signals in order of increasing cost, $\epsilon_1 < \epsilon_2 < \ldots < \epsilon_N$.
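
As a small numerical illustration of how the state-to-signal costs are referred to the signals, the following Python sketch evaluates $\epsilon_j = \sum_k Q_{kj}\epsilon_{kj}$. The matrices `Q` and `eps_kj` are hypothetical values chosen only for illustration; they are not taken from the paper or from any experiment.

```python
import numpy as np

# Hypothetical error matrix: Q[k, j] = p(c_k | s_j), so every column sums to 1.
Q = np.array([[0.9, 0.1, 0.0],
              [0.1, 0.8, 0.0],
              [0.0, 0.1, 1.0]])

# Hypothetical costs: eps_kj[k, j] is the cost of converting state c_k into signal s_j.
eps_kj = np.array([[1.0, 2.0, 3.0],
                   [1.5, 2.0, 3.0],
                   [2.0, 2.5, 3.0]])

# Cost referred to each signal: eps_j = sum_k Q_kj * eps_kj
eps_j = (Q * eps_kj).sum(axis=0)
print(eps_j)
```

With these illustrative numbers the signal costs come out in increasing order, matching the labelling convention used in the text.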

We also need to formalise the notion of correlation between the signaller's 
states and the signals in order to consider the consequences of cost and errors 
for this correlation. We require a general measure of correlation that is valid 
for any nonlinear dependencies, unlike correlation functions (?), and that 
does not use a metric that measures correlation in an arbitrary manner. The 
averaged distance between the actual joint distribution $p(c_i, s_j)$ and the
distribution corresponding to complete decorrelation
$p(c_i, s_j)_{\mathrm{decorr}} = p(c_i)p(s_j)$ gives such a general measure of
correlation of the form

$$I(C;S) = \sum_{i,j} p(c_i, s_j)\log\left(\frac{p(c_i, s_j)}{p(c_i)\,p(s_j)}\right) \qquad (1)$$

that is zero for the completely decorrelated case and increases with increasing 
correlation. This is the standard measure of statistical correlation used in 
communication theory where it is known as mutual information (?). The 
mutual information $I$ takes care of the errors as a constraint as it decreases
for an error matrix with larger non-diagonal elements. To see this, we can
write its expression in (1) in terms of the error matrix $Q$ by separating
it into the signal variability and the signal uncertainty terms as
$I(C;S) = H(S) - H(S|C)$, with $H(S) = -\sum_j p(s_j)\log p(s_j)$ and
$H(S|C) = \sum_j p(s_j)\xi_j$, with

$$\xi_j = -\sum_k Q_{kj}\log P_{jk} \qquad (2)$$

a measure of the signal uncertainty for signal $s_j$ and $P_{jk} = p(s_j|c_k)$
the probability that the state $c_k$ produces the signal $s_j$. We can express
$P_{jk}$ in terms of $Q$ using Bayes' theorem as
$P_{jk} = p(s_j)Q_{kj} / \sum_i p(s_i)Q_{ki}$. With these relations we see
that the mutual information can be written as the difference of a term $H(S)$
that measures the variability of the signal and a term $H(S|C)$ that measures
the signal uncertainty as the variability of the signal that comes from the
errors in $Q$. This second term $H(S|C)$ is the constraint given by the errors.
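
The decomposition of the mutual information into signal variability and signal uncertainty can be checked numerically. The sketch below is only an illustration: the function names and the symmetric error model are assumptions made here, not part of the paper. It computes $H(S)$, $H(S|C)$ and $I$ from a signal usage and an error matrix $Q$, using Bayes' theorem for $P_{jk}$ and expression (2) for $\xi_j$.

```python
import numpy as np

def information_terms(p_s, Q):
    """Return H(S), H(S|C) and I(C;S) in nats.

    p_s[j]  : probability of using signal s_j
    Q[k, j] : error matrix p(c_k | s_j); every column sums to 1
    """
    p_s = np.asarray(p_s, dtype=float)
    Q = np.asarray(Q, dtype=float)
    joint = Q * p_s                                   # p(c_k, s_j)
    p_c = joint.sum(axis=1, keepdims=True)            # p(c_k)
    # Bayes' theorem: P[k, j] = p(s_j | c_k), written P_jk in the text
    P = np.divide(joint, p_c, out=np.zeros_like(joint), where=p_c > 0)
    logP = np.log(P, out=np.zeros_like(P), where=P > 0)
    xi = -(Q * logP).sum(axis=0)                      # eq. (2)
    used = p_s > 0
    H_S = -np.sum(p_s[used] * np.log(p_s[used]))      # signal variability
    H_S_given_C = np.dot(p_s, xi)                     # signal uncertainty
    return H_S, H_S_given_C, H_S - H_S_given_C

def symmetric_error_matrix(n, err):
    """Hypothetical error model: probability err spread evenly over the wrong states."""
    Q = np.full((n, n), err / (n - 1))
    np.fill_diagonal(Q, 1.0 - err)
    return Q

p_uniform = np.full(4, 0.25)
for err in (0.0, 0.1, 0.3):
    H_S, H_SC, I = information_terms(p_uniform, symmetric_error_matrix(4, err))
    print(f"err={err:.1f}  H(S)={H_S:.3f}  H(S|C)={H_SC:.3f}  I={I:.3f}")
```

In this illustrative setting $H(S)$ stays fixed while $H(S|C)$ grows with the size of the non-diagonal elements, so $I$ decreases, as stated in the text.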

Using the mutual information $I$ as the measure of correlation between
states and signals, which includes the constraint given by the errors, together
with the cost constraint, we can now formulate our problem precisely. With
which frequencies $p(s_i)$ should the signals $S$ be used to have a high mutual
information $I$ between states $C$ and signals $S$ given the errors $Q$ and the
average cost $E = \sum_i p(s_i)\epsilon_i$ as the biological constraints? To
answer this question we use the method of Lagrange multipliers (see Appendix).
The solution of the equations obtained by this method can be found using
different numerical methods and we have chosen the one given in Algorithm 1,
based on the Blahut-Arimoto algorithm (??), commonly used in rate distortion
theory (?), because it is particularly transparent as to the form of the
solution. From Algorithm 1, we obtain that the optimal signal usage taking
errors and cost as constraints is of the form in (4),

$$\hat{p}(s_j) = \hat{Z}^{-1}\exp\left(-\hat{\beta}\epsilon_j - \hat{\xi}_j\right), \qquad (6)$$

where the hat on $\hat{p}$, $\hat{Z}$, $\hat{\beta}$ and $\hat{\xi}$ is a
reminder that their values are obtained using the iterative Algorithm 1. The
expression for $\xi$ is given in (2) and $Z$ is the normalisation constant.
This solution has a number of interesting characteristics. Both the signal
cost, through the term $\beta\epsilon_j$, and the signal uncertainty from the
errors, $\xi_j$, penalise the usage of the signal $s_j$ in an exponential form.
With no errors present the signal usage is a decaying exponential with the
signal cost $\epsilon$. And with no cost constraint the signal usage is an
exponential against the signal uncertainty from the errors $\xi$. The
distribution for the error-free case coincides with the one obtained in
Statistical Mechanics, where it is known as the Boltzmann distribution. We
name the general distribution including the effect of the errors in (6) a
generalised Boltzmann distribution. To obtain the general relationship
between statistical correlation $I$ and average cost, substitute the
distribution in (6) in the expression for $I$ in (1) to obtain
$I = \beta E + \log Z$, where the parameter $\beta$ given in (5) and the
normalisation constant $Z$ are nonlinear functions of the average cost $E$.
This expression is the most general relationship between mutual information
and cost for efficient signalling.

Algorithm 1 Optimal signal usage with noise and cost constraints

Initialise the signal usage to a random vector $p^1$.
for $t = 1, 2, \ldots$ until convergence do

$$P^t_{jk} = \frac{p^t(s_j)Q_{kj}}{\sum_i p^t(s_i)Q_{ki}} \qquad (3)$$

$$p^{t+1}(s_j) = \frac{\exp\left[-\left(\beta^t\epsilon_j - \sum_k Q_{kj}\log P^t_{jk}\right)\right]}{\sum_i \exp\left[-\left(\beta^t\epsilon_i - \sum_k Q_{ki}\log P^t_{ik}\right)\right]} \qquad (4)$$

where $\beta^t$ in (4) has to be evaluated for each $t$ from the cost constraint

$$\frac{\sum_j \epsilon_j\exp\left[-\left(\beta^t\epsilon_j - \sum_k Q_{kj}\log P^t_{jk}\right)\right]}{\sum_i \exp\left[-\left(\beta^t\epsilon_i - \sum_k Q_{ki}\log P^t_{ik}\right)\right]} = E \qquad (5)$$

end for
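
Algorithm 1 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the author's implementation: the function names, the use of bisection to solve the cost constraint (5) for $\beta^t$, the bracketing interval for $\beta$ and the convergence tolerance are all choices made here for illustration.

```python
import numpy as np

def _weights(beta, eps, xi):
    # Unnormalised weights exp(-beta*eps_j - xi_j), shifted by the largest
    # exponent for numerical stability (the shift cancels after normalisation).
    logw = -beta * eps - xi
    return np.exp(logw - logw.max())

def solve_beta(eps, xi, E, lo=-50.0, hi=50.0, iters=100):
    """Bisection on the cost constraint (5); the bracket [lo, hi] is an assumption."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        w = _weights(mid, eps, xi)
        if np.dot(w, eps) / w.sum() > E:   # mean cost too high: beta must grow
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def optimal_signal_usage(Q, eps, E, tol=1e-12, max_iter=5000, seed=0):
    """Q[k, j] = p(c_k | s_j) (columns sum to 1), eps[j] = cost of s_j, E = average cost."""
    Q = np.asarray(Q, dtype=float)
    eps = np.asarray(eps, dtype=float)
    p = np.random.default_rng(seed).random(eps.size)
    p /= p.sum()                                   # random initial signal usage
    for _ in range(max_iter):
        joint = Q * p                              # p(c_k, s_j)
        p_c = joint.sum(axis=1, keepdims=True)     # p(c_k)
        P = np.divide(joint, p_c, out=np.zeros_like(joint), where=p_c > 0)  # eq. (3)
        logP = np.log(P, out=np.zeros_like(P), where=P > 0)
        xi = -(Q * logP).sum(axis=0)               # eq. (2)
        beta = solve_beta(eps, xi, E)              # eq. (5)
        w = _weights(beta, eps, xi)
        p_new = w / w.sum()                        # eq. (4)
        done = np.max(np.abs(p_new - p)) < tol
        p = p_new
        if done:
            break
    return p, beta, xi

# Error-free example: the usage reduces to a Boltzmann distribution in the cost.
p, beta, xi = optimal_signal_usage(np.eye(4), [1.0, 2.0, 3.0, 4.0], E=2.0)
print(p, beta)
```

The average cost $E$ must lie strictly between the smallest and largest signal cost for the bisection to find a solution. At convergence the returned `p`, `beta` and `xi` can be used to check the relation $I = \beta E + \log Z$ numerically.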

Given the error matrix $Q$, an average cost $E$ and signal costs $\epsilon_j$
that can be obtained either experimentally or from theoretical models,
Algorithm 1 gives the optimal signal usage that maximizes signal quality while
maximizing cost-efficiency. We can advance some characteristics of the signal
usage for optimal communication. In biological systems we expect that the 
errors produced with highest probability are those with the lowest amplitude. 
Two examples illustrate this point. Consider first a cell that translates some 
states into signals but that when it is in a nonsignalling state, spontaneously 
produces signals by error. The most probable signals to be produced by error 
are those of lowest amplitude and therefore lowest cost. This is the case in 
neurons when different values of spike rates are used as different signals and
spontaneous signalling, say following a Poisson distribution, produces the
highest rates with very low probability. The signals of lower rate then have a
higher signal uncertainty and, according to the expression in (6), are then
underutilized. As a second example consider animal communication. According
to the present framework, cheaters that can produce low-cost signals enter
as errors in the communication between healthy animals. These errors give
the low-cost signals a higher uncertainty and, as in the case of neuronal
signalling, according to (6) the low-cost signals should be underutilized.



3 Comparison with experiments 

The signal usage of a small percentage of neurons, 16% in the case of neurons
in the visual cortex area MT of macaques (?), can be explained with a theory
of cost-efficient signalling (??). To explain the signal usage for the totality
of visual cortex neurons we use the formalism presented in the previous
section, which requires not only signal efficiency but also signal quality. As
in (?), the present formalism assumes a maximum signal variability with an
energy constraint; the novelty here is to require also signal quality by
minimizing signal uncertainty. We also assume that the spike rates are the
symbols that the visual cortex neurons use to communicate (???) and that the
cost of each symbol in ATP molecules can be taken to be linearly proportional
to the rate value. As a simple model of the main contribution from noise, we
assume spontaneous signalling when the cell should be in a nonsignalling
state. This random spike production is modelled by a Poisson distribution,
with the average number of spikes produced by error in an interval as the
single parameter that distinguishes different cells. The optimal signal usage
obtained from Algorithm 1 for this case can then be approximated as
(see Appendix)

$$p(\mathrm{Rate}) = Z^{-1}\exp\left(-\exp(-\mathrm{Rate}/\alpha) - \beta\,\mathrm{Rate}\right), \qquad (7)$$

where $Z$ is the normalization constant. Cost efficiency is assured by the term
$-\beta\,\mathrm{Rate}$, which penalizes signals by their cost. Signal quality
is assured by the term $\exp(-\mathrm{Rate}/\alpha)$, which penalizes signals
by their signal uncertainty, which increases with $\alpha$. The predictions
made by the optimal signalling in (7) are: (a) For high rate values the term
required for signal quality in (7) is negligible, so optimal signal usage
reduces to an exponential decaying with rate, that is, a straight line in a
logarithmic plot. (b) Low rate values are expected to be underutilized with
respect to the straight line in (a). Specifically, the difference between the
straight line in (a) and the logarithm of the probability,
$-\beta\,\mathrm{Rate} - \log(p)$, must be a decreasing exponential. We
compare these predictions to the recently reported rate distributions of
inferior temporal cortex neurons of two rhesus macaques responding to video
scenes (?). The experimental distributions of rates for two of the cells
(labelled as ba001-01 and ay102-02 in (?)) are given in Figure 1 using a
400 ms window. As seen in Figure 1, the two predictions correspond to the
experimental data. Cost-efficiency is responsible for the signal usage at high
rates and both cost-efficiency and signal quality for the signal usage at
lower values of rate. Different neurons may have different values of the
average cost and different noise properties but the signal usage seems to be
adapted to the optimal values for each cell.
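
The two predictions can be illustrated by evaluating the functional form in (7) on a grid of rates. The sketch below is only an illustration: the parameter values for $\alpha$ and $\beta$ and the grid of rates are hypothetical and are not fitted to the experimental data of Figure 1.

```python
import numpy as np

def rate_usage(rates, alpha, beta):
    """Normalised signal usage of the form in eq. (7) on a discrete grid of rates."""
    rates = np.asarray(rates, dtype=float)
    log_w = -np.exp(-rates / alpha) - beta * rates
    w = np.exp(log_w)
    return w / w.sum()

rates = np.arange(0, 61)            # spike counts per window (illustrative grid)
alpha, beta = 5.0, 0.1              # hypothetical values, not fitted to data
p = rate_usage(rates, alpha, beta)

mode = rates[np.argmax(p)]                    # most used rate
tail_slope = np.log(p[51] / p[50])            # slope of log p at high rates
# Distance to the straight line of prediction (a); equals exp(-Rate/alpha) + log Z
deviation = -beta * rates - np.log(p)

print("most used rate:", mode)
print("tail slope:", tail_slope, "compared with -beta =", -beta)
```

For these illustrative parameters the most used rate is above zero, the logarithmic slope at high rates approaches $-\beta$, and `deviation` decays exponentially with rate apart from the additive constant $\log Z$.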

4 Discussion 

We have proposed an optimization principle of coding that takes into account 
both the noise and the cost associated with the coding. The outcome of this 
principle is the prediction of the signal usage for efficient signalling systems. 






The optimal signal usage for a communication system constrained by errors 
and cost has been shown to have a generalised Boltzmann form in equation 
(6) that penalises signals that are costly and that are sensitive to errors. 
In the evolution of signalling systems towards efficiency, noisy signals of
low amplitude, and therefore low cost, are responsible for a shift of signalling
codes to higher cost that minimizes signal uncertainty. For the simplest case of linear
costs and low cost noisy signals, the two main features of this optimal signal 
usage are an exponential tail at high cost signals needed for cost efficiency 
and an underutilisation of the low cost signals required to protect the signal 
quality against errors while maintaining the cost efficiency. The predictions 
made by this optimal signal usage have been shown to correspond to the 
experimental measurements in visual cortex neurons. 

We have so far discussed cell signalling but, as noted in the Introduction,
we have chosen this particular type of signalling for concreteness. The
theoretical framework proposed here does not require knowledge of the
underlying mechanisms of signalling. The theory only uses the notion of
statistical correlation of states and signals, without the need to specify
how this correlation is physically established and without any description of
the types of signals except for the costs and errors. This is enough to
understand the optimal signal usage with cost and error constraints. For this
reason, the results apply generally to biological communication and also to
non-biological communication. Intracellular communication and
machine-machine communication are two possible domains of application. Another
important case is animal communication, for which game-theoretical models
have predicted that the evolutionary incentive to deceive is overcome by
increasing the cost of signals (???). These costly signals are called
handicaps and make the communication reliable in the sense of being honest. A
different perspective is gained from the formalism presented here. Cheaters
enter as errors in the communication between healthy animals and, as they are
only able to produce low cost signals, the signal uncertainty of the low cost
signals is higher. According to the general result in (6) these low cost
signals should be underutilized by healthy animals for efficient
communication. This means that signal quality requires a shift to high cost
signals, as we saw in the case of neurons. In this case, the cost can be
metabolic, in time or in risks. In this way we obtain a statement of the
handicap principle based on optimal communication without using the theory of
games. Provided we know the communication symbols, their cost and error
characteristics, the present formalism would give the optimal use of symbols
according to signal quality and cost-efficiency. In general, a combination of
both theories with competition elements and signal quality should be used.

It is interesting to discuss the limits of the theoretical framework. First,
we have assumed that errors and cost are the only constraints of the
communication system. Although these constraints are universal, particular
systems might have extra constraints, which can be added straightforwardly to
the present formalism. However, even in the presence of new constraints, the
effect of the errors of the low cost signals would be to shift the signalling
code to higher cost. Second, we have argued that in biological communication
systems the errors that are produced with highest probability are those of the
lowest amplitude and therefore of the lowest cost. For efficient signalling,
we have seen that the consequence of the noise of low cost signals is to shift
signals to a higher cost code. However, it is possible to have a more
sophisticated noise structure that can affect the high cost signals. For
example, processing of the signals at the receiver cell might fail more
frequently for the most complex incoming signals, typically those with highest
cost. In this case, there should be an extra penalisation of the high cost
signals and the decay of the distribution would be faster than exponential.
There is partial experimental evidence for this type of code in (?) (see their
Figure 4(f,g)). In any case, the general result of the theory presented here
is the generalized Boltzmann form in (6), which holds for any efficient
signalling as it makes no assumptions about the noise or cost properties.
Appendix

The optimization principle proposed consists in maximizing the mutual
information subject to a cost constraint $E = \sum_i p(s_i)\epsilon_i$, where
$E$ is the value of the average cost, $\{\epsilon_i\}$ are the costs of the
different signals and $\{p_i\}$ the different probabilities of using the
signals. Formally, using the method of Lagrange multipliers, we can write this
optimization principle as





where the mutual information is given by 



$$I = H(S) - H(S|C), \qquad (9)$$

with the entropy of the signal $H(S)$ given by

$$H(S) = -\sum_j p(s_j)\log p(s_j) \qquad (10)$$

and the entropy of the errors or noise,

$$H(S|C) = -\sum_{j,k} p(s_j, c_k)\log p(s_j|c_k),$$

can be written as

$$H(S|C) = -\sum_j p(s_j)\sum_k Q_{kj}\log P_{jk}. \qquad (11)$$

The matrix $Q$ has elements $Q_{kj} = p(c_k|s_j)$ given by the probability that
the signal $s_j$ comes from the state $c_k$. The matrix $P$ has elements
$P_{jk} = p(s_j|c_k)$ given by the probability that a state produces the signal
$s_j$, which can be written in terms of the probability of finding a signal
$p(s_j)$ and the matrix $Q$ using Bayes' theorem as
$P_{jk} = p(s_j)Q_{kj} / \sum_i p(s_i)Q_{ki}$. The optimization principle of
coding includes both the errors through the error matrix $Q$ (or $P$) and the
costs associated with the coding. The general solution to this optimization
principle is numerical. Before discussing this general numerical solution, we
consider two particular cases that are analytical.

All signals with same noise. In this case the entropy of the noise reduces to
a constant independent of the probabilities of using different signals,
$H(S|C) = a$. The optimization principle gives a result independent of the
value of $a$, with the probability of using a signal as an exponential
decreasing with cost (a Boltzmann distribution)

$$p(s_i) = \frac{\exp(-\beta\epsilon_i)}{\sum_j \exp(-\beta\epsilon_j)}, \qquad (12)$$

with the parameter $\beta$ given by the average cost $E$ as

$$\frac{\sum_i \exp(-\beta\epsilon_i)\,\epsilon_i}{\sum_i \exp(-\beta\epsilon_i)} = E. \qquad (13)$$

For the case in which the cost is an average time
$\bar{T} = \int dT\, p(T)\, T$, the Boltzmann distribution reduces to the
Poisson distribution

$$p(T) = \bar{T}^{-1}\exp(-\bar{T}^{-1}T). \qquad (14)$$

A simple noise structure. As a toy analytical model of the results presented
in this paper, consider a simple case of three signals in which the first two
require the same cost, $\epsilon_1 = \epsilon_2$, and the third one a higher
cost $\epsilon_3 > \epsilon_1$, and with the noise matrix elements
$p(c_1|s_1) = p(c_2|s_2) = 1 - p$, $p(c_2|s_1) = p(c_1|s_2) = p$ and
$p(c_3|s_3) = 1$. This toy model has two noisy signals with lower cost and
a higher cost signal with no noise. The noise entropy for this case has the
form

$$H(S|C) = \left(p(s_1) + p(s_2)\right)\xi, \qquad (15)$$

with $\xi = -p\log p - (1 - p)\log(1 - p)$. The optimization principle for this
example gives the probabilities

$$p(s_{1,2}) = p(m_{1,2}) = Z^{-1}\exp(-\beta\epsilon_1 - \xi), \qquad (16)$$
$$p(s_3) = p(m_3) = Z^{-1}\exp(-\beta\epsilon_3), \qquad (17)$$

with $Z = 2\exp(-\beta\epsilon_1 - \xi) + \exp(-\beta\epsilon_3)$ the
normalisation constant and $\beta$ given by the value of the average energy
$2p(s_1)\epsilon_1 + p(s_3)\epsilon_3 = E$. The first two signals deviate from
the Boltzmann form and are underutilized, thus preserving signal quality.
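
The toy model can be evaluated directly from (15)-(17). In the sketch below, $\beta$ is treated as a free parameter rather than solved from the average cost $E$, which is an assumption made here to compare the error-free and noisy cases at the same $\beta$; the function name and the numerical values are likewise illustrative only.

```python
import math

def toy_usage(beta, eps1, eps3, p_err):
    """Signal usage for the three-signal toy model at a given beta.

    Signals 1 and 2 share the cost eps1 and are confused with probability
    p_err; signal 3 has cost eps3 > eps1 and no noise.
    """
    if 0.0 < p_err < 1.0:
        xi = -p_err * math.log(p_err) - (1 - p_err) * math.log(1 - p_err)
    else:
        xi = 0.0                       # no uncertainty in the error-free limit
    w12 = math.exp(-beta * eps1 - xi)  # weight of each noisy low-cost signal
    w3 = math.exp(-beta * eps3)        # weight of the noiseless high-cost signal
    Z = 2 * w12 + w3                   # normalisation constant
    p12, p3 = w12 / Z, w3 / Z
    mean_cost = 2 * p12 * eps1 + p3 * eps3
    return p12, p3, mean_cost

for p_err in (0.0, 0.1, 0.3):
    p12, p3, E = toy_usage(beta=1.0, eps1=1.0, eps3=2.0, p_err=p_err)
    print(f"p_err={p_err:.1f}  p(s1)=p(s2)={p12:.3f}  p(s3)={p3:.3f}  E={E:.3f}")
```

At fixed $\beta$, increasing the error probability lowers the usage of the two low-cost signals and raises both the usage of the high-cost signal and the average cost.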

General solution. Differentiating the constrained mutual information and
equating to zero gives the probability of using a signal of the form

$$p(s_j) = \frac{\exp\left[-\left(\beta\epsilon_j - \sum_k Q_{kj}\log P_{jk}\right)\right]}{\sum_i \exp\left[-\left(\beta\epsilon_i - \sum_k Q_{ki}\log P_{ik}\right)\right]}, \qquad (18)$$

but $P_{jk} = p(s_j)Q_{kj} / \sum_i p(s_i)Q_{ki}$ also depends on $p(s_j)$.
This creates a nontrivial self-referential problem. However, the optimization
of the mutual information with respect to the probability of using a signal
$p(s_i)$ can be written as a double maximization (see Lemma 13.8.1 in ?), that
is, the maximization of the mutual information

$$\max_{p(s_j)} I(C;S) \qquad (19)$$

can be written as the double maximization

$$\max_{p(s_j)}\,\max_{P_{jk}}\,\sum_{j,k} p(s_j)\,Q_{kj}\log\frac{P_{jk}}{p(s_j)}. \qquad (20)$$

This double maximization suggests the possibility of an alternating
maximization algorithm. Csiszar and Tusnady (?) have shown that an
alternating maximization algorithm for this problem converges to the required
maximum. The algorithm starts with a guess of an optimal $p(s_i)$ and with
that calculates the conditional probability $P_{ij}$. This conditional
probability is then used to recalculate a better guess of the optimal
$p(s_i)$ and the procedure is continued until convergence. This algorithm,
including in our case the cost constraint, is given as Algorithm 1 in the main
text.

For the case of a neuron, we would ideally include in Algorithm 1 the
experimentally measured values of the noise matrix $Q$, the signal costs $\{\epsilon_j\}$
and the average cost $E$. As these values are not available from experiments
at present, we consider the simplest models for both cost and noise.
We consider a simple model of cost linearly proportional to the number of
spikes in the time interval of interest $T$, $\epsilon_i \propto i$, and the
noise to be Poisson spontaneous signalling,
$p(s_i|c_0) = (\nu T)^i\exp(-\nu T)/i!$, with $\nu$ the frequency of
spontaneous signalling and $T$ again the time interval of interest. For low
$\nu$ this last expression can be approximated by an exponential,
$p(s_i|c_0) \propto \exp(-\gamma i)$. Inserting these two approximations into
Algorithm 1, we find a good correspondence between theory and experiments for
all cortex neurons tested, allowing for a different amount of spontaneous
signalling parametrized by $\gamma$ and for a different average cost $E$ for
each neuron. To obtain a simple analytical expression common to all cortex
neurons, we fit the numerical data, or directly the experimental data, with
the functional form suggested by the theory,
$p(\mathrm{Rate}; \alpha, \beta) \propto \exp\left(-\xi(\mathrm{Rate}; \alpha) - \beta\,\mathrm{Rate}\right)$,
to find $\xi(\mathrm{Rate}; \alpha) \propto \exp(-\mathrm{Rate}/\alpha)$ with
different values of the parameter $\alpha$ depending on the noise of the
particular neuron.






FIGURE CAPTIONS 



Figure 1. The probability distribution of rate usage for visual cortex 
neurons follows the optimal distribution in equation (7) (solid line) with the 
predicted exponential tail (dashed line) for high rates and the underutilisation 
at low costs. The exponential tail makes visual cortex neurons cost efficient 
and the underutilisation of the low cost signals protects their signal quality 
against errors while remaining cost efficient. The errors are responsible for a 
shift to higher cost signals, with a maximum at a rate of value of 10 spikes 
in the 400 ms window instead of at a rate of 1 spike if there were no errors 
present. The experimental data have been taken from the two visual cortex 
neurons labelled as (a) ba001-01 and (b) ay102-02 in (?).